
fix: normalize response images missing index + guard audio duration o…#22955

Open
krrishdholakia wants to merge 9 commits into main from litellm_oss_staging_03_06_2026

Conversation

@krrishdholakia
Member

…verflow (#22950)

When the messages or response JSON fields in spend logs are truncated before being written to the database, the truncation marker now includes a note explaining:

  • This is a DB storage safeguard
  • Full, untruncated data is still sent to logging callbacks (OTEL, Datadog, etc.)
  • The MAX_STRING_LENGTH_PROMPT_IN_DB env var can be used to increase the limit

Also emits a verbose_proxy_logger.info message when truncation occurs in the request body or response spend log paths.
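As a rough sketch of the behavior described above (the marker string is taken from the review discussion later in this PR; the note wording, helper name, and default limit are assumptions, not LiteLLM's actual implementation):

```python
import os
from typing import Optional

# Constant names follow the snippets quoted later in this review.
LITELLM_TRUNCATED_PAYLOAD_FIELD = "litellm_truncated"
LITELLM_TRUNCATION_DB_SAFEGUARD_NOTE = (
    "DB storage safeguard: full, untruncated data is still sent to logging "
    "callbacks; set MAX_STRING_LENGTH_PROMPT_IN_DB to raise this limit."
)


def truncate_for_db(value: str, max_len: Optional[int] = None) -> str:
    """Truncate a spend-log field for DB storage, appending the explanatory note."""
    limit = max_len if max_len is not None else int(
        os.getenv("MAX_STRING_LENGTH_PROMPT_IN_DB", "10000")
    )
    if len(value) <= limit:
        return value
    # Keep the leading slice and append the marker plus the safeguard note.
    return (
        value[:limit]
        + f" [{LITELLM_TRUNCATED_PAYLOAD_FIELD}: {LITELLM_TRUNCATION_DB_SAFEGUARD_NOTE}]"
    )
```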

Adds 3 new tests:

  • test_truncation_includes_db_safeguard_note
  • test_response_truncation_logs_info_message
  • test_request_body_truncation_logs_info_message

  • Fix admin viewer unable to see all organizations

The /organization/list endpoint only checked for PROXY_ADMIN role, causing PROXY_ADMIN_VIEW_ONLY users to fall into the else branch which restricts results to orgs the user is a member of. Use the existing _user_has_admin_view() helper to include both roles.
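A minimal sketch of the role-check change, assuming the role names and helper described above (actual LiteLLM signatures may differ):

```python
from typing import List

# Role constants are assumptions based on the names in the description.
PROXY_ADMIN = "proxy_admin"
PROXY_ADMIN_VIEW_ONLY = "proxy_admin_viewer"


def _user_has_admin_view(user_role: str) -> bool:
    """Both full admins and view-only admins get the admin-wide view."""
    return user_role in (PROXY_ADMIN, PROXY_ADMIN_VIEW_ONLY)


def list_organizations(
    user_role: str, member_org_ids: List[str], all_org_ids: List[str]
) -> List[str]:
    # Before the fix: `if user_role == PROXY_ADMIN` sent view-only admins
    # into the member-only else branch. After: both admin roles see all orgs.
    if _user_has_admin_view(user_role):
        return all_org_ids
    return [o for o in all_org_ids if o in member_org_ids]
```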

  • feat(ui): add Chat UI — ChatGPT-like interface with MCP tools and streaming (#22937)

  • feat(ui): add chat message and conversation types

  • feat(ui): add useChatHistory hook for localStorage-backed conversations

  • feat(ui): add ConversationList sidebar component

  • feat(ui): add MCPConnectPicker for attaching MCP servers to chat

  • feat(ui): add ModelSelector dropdown for chat

  • feat(ui): add ChatInputBar with MCP tool attachment support

  • feat(ui): add MCPAppsPanel with list/detail view for MCP servers

  • feat(ui): add ChatMessages component; remove auto-scrollIntoView that caused scroll-lock bypass

  • feat(ui): add ChatPage — ChatGPT-like UI with scroll lock, MCP tools, streaming

  • feat(ui): add /chat route wired to ChatPage

  • feat(ui): remove chat from leftnav — chat accessible via navbar button

  • feat(ui): add Chat button to top navbar

  • feat(ui): add dismissible Chat UI announcement banner to Playground page

  • feat(proxy): add Chat UI link to Swagger description

  • feat(ui): add react-markdown and syntax-highlighter deps for chat UI

  • fix(ui): replace missing BorderOutlined import with inline stop icon div

  • fix(ui): apply remark-gfm plugin to ReactMarkdown for GFM support

  • fix(ui): remove unused isEvenRow variable in MCPAppsPanel

  • fix(ui): add ellipsis when truncating conversation title

  • fix(ui): wire search button to chats view; remove non-functional keyboard hint

  • fix(ui): use serverRootPath in navbar chat link for sub-path deployments

  • fix(ui): remove unused ChatInputBar and ModelSelector files

  • fix(ui): correct grid bottom-border condition for odd server count

  • fix(chat): move localStorage writes out of setConversations updater (React purity)

  • fix(chat): fix stale closure in handleEditAndResend - compute history before async state update

  • fix(chat): fix 4 issues in ChatMessages - array redaction, clipboard error, inline detection, remove unused ref

  • docs: add PayGo/priority cost tracking for Gemini Vertex AI

  • Add PayGo / Priority Cost Tracking section to Vertex AI provider docs
  • Document trafficType to service_tier mapping (ON_DEMAND_PRIORITY, FLEX, etc.)
  • Add service tier cost keys to custom pricing docs
  • Add provider-specific cost tracking note to spend tracking overview
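The trafficType → service_tier mapping those docs describe might be sketched like this; only ON_DEMAND_PRIORITY and FLEX are named in this PR, and the tier strings below are illustrative assumptions, not LiteLLM's actual keys:

```python
# Hypothetical mapping for illustration — the real key/value names live in
# the Vertex AI docs added by this PR.
TRAFFIC_TYPE_TO_SERVICE_TIER = {
    "ON_DEMAND_PRIORITY": "priority",
    "FLEX": "flex",
}


def service_tier_for(traffic_type: str) -> str:
    """Map a Gemini/Vertex trafficType to a billing service tier, defaulting otherwise."""
    return TRAFFIC_TYPE_TO_SERVICE_TIER.get(traffic_type, "default")
```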

Made-with: Cursor

  • fix: normalize response images missing index + guard audio duration overflow
  1. convert_dict_to_response.py (#22640): Providers like OpenRouter/Gemini return images without the required index field, causing a pydantic ValidationError when constructing Message. Added _normalize_images() to backfill index from the enumeration position.

  2. audio_utils/utils.py (#22622): libsndfile can report 2^63-1 frames for malformed audio files, causing astronomically large duration values used for cost calculation. Added guards for sentinel frame counts and implausible durations (>24h).

  • fix: add type annotations to _normalize_images + guard samplerate==0

Address review feedback:

  • Add type hints to _normalize_images() for consistency with codebase
  • Guard against samplerate <= 0 to prevent ZeroDivisionError on malformed audio files

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

…verflow (#22950)

* feat(spend-logs): add truncation note when error logs are truncated for DB storage (#22936)

When the messages or response JSON fields in spend logs are truncated
before being written to the database, the truncation marker now includes
a note explaining:
- This is a DB storage safeguard
- Full, untruncated data is still sent to logging callbacks (OTEL, Datadog, etc.)
- The MAX_STRING_LENGTH_PROMPT_IN_DB env var can be used to increase the limit

Also emits a verbose_proxy_logger.info message when truncation occurs in
the request body or response spend log paths.

Adds 3 new tests:
- test_truncation_includes_db_safeguard_note
- test_response_truncation_logs_info_message
- test_request_body_truncation_logs_info_message

Co-authored-by: Cursor Agent <cursoragent@cursor.com>

* Fix admin viewer unable to see all organizations

The /organization/list endpoint only checked for PROXY_ADMIN role,
causing PROXY_ADMIN_VIEW_ONLY users to fall into the else branch
which restricts results to orgs the user is a member of. Use the
existing _user_has_admin_view() helper to include both roles.

* feat(ui): add Chat UI — ChatGPT-like interface with MCP tools and streaming (#22937)

* feat(ui): add chat message and conversation types

* feat(ui): add useChatHistory hook for localStorage-backed conversations

* feat(ui): add ConversationList sidebar component

* feat(ui): add MCPConnectPicker for attaching MCP servers to chat

* feat(ui): add ModelSelector dropdown for chat

* feat(ui): add ChatInputBar with MCP tool attachment support

* feat(ui): add MCPAppsPanel with list/detail view for MCP servers

* feat(ui): add ChatMessages component; remove auto-scrollIntoView that caused scroll-lock bypass

* feat(ui): add ChatPage — ChatGPT-like UI with scroll lock, MCP tools, streaming

* feat(ui): add /chat route wired to ChatPage

* feat(ui): remove chat from leftnav — chat accessible via navbar button

* feat(ui): add Chat button to top navbar

* feat(ui): add dismissible Chat UI announcement banner to Playground page

* feat(proxy): add Chat UI link to Swagger description

* feat(ui): add react-markdown and syntax-highlighter deps for chat UI

* fix(ui): replace missing BorderOutlined import with inline stop icon div

* fix(ui): apply remark-gfm plugin to ReactMarkdown for GFM support

* fix(ui): remove unused isEvenRow variable in MCPAppsPanel

* fix(ui): add ellipsis when truncating conversation title

* fix(ui): wire search button to chats view; remove non-functional keyboard hint

* fix(ui): use serverRootPath in navbar chat link for sub-path deployments

* fix(ui): remove unused ChatInputBar and ModelSelector files

* fix(ui): correct grid bottom-border condition for odd server count

* fix(chat): move localStorage writes out of setConversations updater (React purity)

* fix(chat): fix stale closure in handleEditAndResend - compute history before async state update

* fix(chat): fix 4 issues in ChatMessages - array redaction, clipboard error, inline detection, remove unused ref

* docs: add PayGo/priority cost tracking for Gemini Vertex AI

- Add PayGo / Priority Cost Tracking section to Vertex AI provider docs
- Document trafficType to service_tier mapping (ON_DEMAND_PRIORITY, FLEX, etc.)
- Add service tier cost keys to custom pricing docs
- Add provider-specific cost tracking note to spend tracking overview

Made-with: Cursor

* fix: normalize response images missing index + guard audio duration overflow

1. convert_dict_to_response.py (#22640): Providers like OpenRouter/Gemini
   return images without the required `index` field, causing pydantic
   ValidationError when constructing Message. Added _normalize_images()
   to backfill index from enumeration position.

2. audio_utils/utils.py (#22622): libsndfile can report 2^63-1 frames
   for malformed audio files, causing astronomically large duration values
   used for cost calculation. Added guards for sentinel frame counts and
   implausible durations (>24h).

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: add type annotations to _normalize_images + guard samplerate==0

Address review feedback:
- Add type hints to _normalize_images() for consistency with codebase
- Guard against samplerate <= 0 to prevent ZeroDivisionError on
  malformed audio files

Co-Authored-By: claude-flow <ruv@ruv.net>

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Ryan Crabbe <rcrabbe@berkeley.edu>
Co-authored-by: ryan-crabbe <128659760+ryan-crabbe@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>
Co-authored-by: claude-flow <ruv@ruv.net>
@vercel

vercel bot commented Mar 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm · Deployment: Error · Updated (UTC): Mar 10, 2026 11:54am

Request Review

@CLAassistant

CLAassistant commented Mar 6, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
5 out of 8 committers have signed the CLA.

✅ xykong
✅ eliasto
✅ santoshkumarradha
✅ Sameerlite
✅ shaeberling
❌ SabaPivot
❌ ocervell
❌ Giulio Leone


Giulio Leone does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@greptile-apps
Contributor

greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR bundles several independent bug fixes: audio duration overflow guards, Anthropic empty-text-block sanitization for the /v1/messages path, a new OVHCloud Responses API config, a base_model provider-prefix alignment fix in the cost calculator, a vector_store_ids passthrough blocklist, and an Ollama get_model_info URL malformation fix.

Key findings:

  • Dead code in the stated primary fix: The PR description highlights _normalize_images() as a core bug fix, but the new function is never called anywhere. The actual fix is the pre-existing _normalize_images_for_message() which was already wired to its call site before this PR. _normalize_images() should be removed or explicitly connected to a call site.
  • Duplicate import in litellm/proxy/proxy_server.py: jwt_key_mapping_router is imported twice from the same module (lines 377–382), which is a lint error.
  • OVHCloud debug log fires both when model info is unavailable and when the model is confirmed not to support function calling, emitting a potentially confusing "check the catalog" hint in the latter case.
  • All functional fixes (audio overflow, Anthropic message sanitization, cost calculator provider alignment, Ollama URL fix, OVHCloud integration) are correct and well-scoped.

Confidence Score: 4/5

  • Safe to merge with minor cleanup — one dead-code function and a duplicate import should be resolved first.
  • All functional fixes (audio overflow, Anthropic message sanitization, cost calculator provider alignment, Ollama URL fix, OVHCloud integration) are correct and well-tested. The main concern is that _normalize_images() — described as a primary bug fix — is unreachable dead code, which undermines confidence in the stated fix. The duplicate import in proxy_server.py is harmless but untidy. The OVHCloud debug log issue is a minor refinement. No regressions or security issues found.
  • litellm/litellm_core_utils/llm_response_utils/convert_dict_to_response.py — _normalize_images is dead code; litellm/proxy/proxy_server.py — duplicate import of jwt_key_mapping_router

Last reviewed commit: a5e10f9

ocervell and others added 2 commits March 5, 2026 20:40
…22910)

Fix malformed URL construction in OllamaConfig.get_model_info() when
api_base already contains an endpoint path (e.g., /api/generate).

Before this fix, when api_base was passed with an endpoint path already
appended (which happens through the completion flow via get_complete_url()),
get_model_info() would naively append /api/show, resulting in URLs like:
http://server:11434/api/generate/api/show (404 error)

This fix strips known endpoint paths (/api/generate, /api/chat, /api/embed)
before appending /api/show, following the same defensive pattern used
elsewhere in the codebase (e.g., embedding handler, chat transformation).

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
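The stripping logic above can be sketched as follows (function name illustrative; the endpoint list is taken from the commit message):

```python
# Known Ollama endpoint paths that may already be appended to api_base
# (per the commit message above).
KNOWN_ENDPOINT_PATHS = ("/api/generate", "/api/chat", "/api/embed")


def build_show_url(api_base: str) -> str:
    """Build the /api/show URL, stripping any endpoint path already on api_base."""
    base = api_base.rstrip("/")
    for path in KNOWN_ENDPOINT_PATHS:
        if base.endswith(path):
            base = base[: -len(path)]
            break
    return base + "/api/show"
```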

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a58d859217



{/* Sidebar nav buttons */}
<div style={{ padding: "0 8px 4px", flexShrink: 0 }}>
{sidebarNavItem(<EditOutlined />, "New chat", () => router.push("/chat"))}


P1 Badge Route chat navigation through UI base path

This uses an absolute "/chat" path, which drops the dashboard base prefix used in proxy deployments (/ui, plus optional server_root_path). In production (NEXT_PUBLIC_BASE_URL="ui/"), this can navigate users outside the mounted UI and produce 404s or non-dashboard routes when starting a new chat or recovering from stale IDs; these navigations should be built with the same base-path helper used elsewhere (/ui/chat with server root prefix).


// Toggle ON — verify tools are reachable first
setTogglingOn((prev) => new Set(prev).add(serverName));
try {
const result = await listMCPTools(accessToken, serverName);


P1 Badge Validate MCP servers using server_id, not display name

The tool-validation call passes serverName (derived from server_name/alias) into listMCPTools, but that endpoint is queried as ?server_id=... and access checks are done against allowed server IDs. When display name differs from server_id, toggle-on will incorrectly fail with "could not load tools" and the server cannot be connected for chat. Use server.server_id for the validation call and keep display names only for UI labels.


eliasto and others added 3 commits March 5, 2026 20:42
* feat(ovhcloud): Add support of responses API

* Edit from greptile recommendation

* fix support of tool calling
* docs: add AgentField to integrations index

* docs: add AgentField tutorial page

* docs: add AgentField to Agent SDKs sidebar
Comment on lines +266 to +275
frames = len(audio)
# Guard against sentinel/invalid frame counts (e.g., 2^63-1 from libsndfile)
if frames <= 0 or frames >= 2**63 - 1:
    return None
if audio.samplerate <= 0:
    return None
duration = frames / audio.samplerate
# Reject implausible durations (> 24 hours)
if duration > 86400:
    return None
Contributor


No unit tests for the sentinel guard and samplerate fix

The two new guards (frames >= 2**63 - 1 and samplerate <= 0) are described as the primary bug fix in the PR title, yet no corresponding test was added in tests/test_litellm/litellm_core_utils/test_audio_utils.py. The project's own pre-submission checklist states "Adding at least 1 test is a hard requirement". Without tests it's difficult to verify the sentinel-value path returns None and that the samplerate guard prevents a ZeroDivisionError.
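A sketch of such a test, run against a local restatement of the quoted guards (the real helper lives in litellm's audio_utils and takes a soundfile object; this standalone version takes the frame count and sample rate directly):

```python
from typing import Optional


def safe_duration(frames: int, samplerate: int) -> Optional[float]:
    """Local restatement of the guard logic quoted above, for testing."""
    # Guard against sentinel/invalid frame counts (e.g., 2^63-1 from libsndfile)
    if frames <= 0 or frames >= 2**63 - 1:
        return None
    # Guard against samplerate <= 0 to prevent ZeroDivisionError
    if samplerate <= 0:
        return None
    duration = frames / samplerate
    # Reject implausible durations (> 24 hours)
    if duration > 86400:
        return None
    return duration
```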

Comment on lines +48 to +59
def _normalize_images(
    images: Optional[List[Dict[str, object]]],
) -> Optional[List[Dict[str, object]]]:
    """Normalize image items to include required 'index' field if missing."""
    if images is None:
        return None
    normalized: List[Dict[str, object]] = []
    for i, img in enumerate(images):
        if isinstance(img, dict) and "index" not in img:
            img = {**img, "index": i}
        normalized.append(img)
    return normalized
Contributor


No unit tests for _normalize_images

This is the other primary bug fix in the PR title (OpenRouter/Gemini images missing index causing pydantic ValidationError), yet no test was added for _normalize_images in tests/test_litellm/. A simple test asserting that:

  • a list of dicts without index gets sequential indices backfilled
  • a list of dicts already having index is returned unchanged
  • None is returned as-is

…would greatly increase confidence in the fix and protect against regressions.

Comment on lines +635 to +638
from litellm.constants import (
    LITELLM_TRUNCATED_PAYLOAD_FIELD,
    LITELLM_TRUNCATION_DB_SAFEGUARD_NOTE,
)
Contributor


Redundant local import inside function body

LITELLM_TRUNCATED_PAYLOAD_FIELD and LITELLM_TRUNCATION_DB_SAFEGUARD_NOTE are now imported at module level (lines 14–17) and again inside _sanitize_request_body_for_spend_logs_payload. The inner import is harmless (Python deduplicates module imports), but it's inconsistent and clutters the function. The local import can be removed.

Suggested change
from litellm.constants import (
    LITELLM_TRUNCATED_PAYLOAD_FIELD,
    LITELLM_TRUNCATION_DB_SAFEGUARD_NOTE,
)
if visited is None:

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

…22933)

The `/v1/messages` endpoint does not sanitize empty text content blocks
from messages, causing 400 errors ("text content blocks must be
non-empty") in multi-turn tool-use conversations.

Claude's API returns assistant messages with empty text blocks alongside
`tool_use` blocks (e.g., `{"type": "text", "text": ""}`). While the API
returns these, it rejects them when sent back in subsequent requests.

The `/v1/chat/completions` path already handles this via
`process_empty_text_blocks()` in `factory.py`, but the `/v1/messages`
path was missing equivalent sanitization.

Fix: Add `_sanitize_anthropic_messages()` at the entry point of the
`/v1/messages` handler, before any routing. This strips empty/whitespace
text blocks from message content arrays while preserving non-empty text
and other block types (`tool_use`, `tool_result`, etc.). The function
does not leave content arrays empty to avoid a different validation error.

This affects all providers routed through `/v1/messages` (Anthropic
direct, Bedrock Converse, Azure, etc.) and is particularly important for
Claude Code / Agent SDK conversations which frequently produce these
patterns.

Fixes #22930
Comment on lines +75 to +102
if not isinstance(content, list):
    continue

# Filter out empty text blocks, keeping non-empty text and other types.
# Use `(... or "")` to guard against None text values.
filtered = [
    block
    for block in content
    if not (
        isinstance(block, dict)
        and block.get("type") == "text"
        and not (block.get("text") or "").strip()
    )
]

# Only update if we actually removed something.
# Avoid mutating the caller's dicts — create a shallow copy.
if len(filtered) < len(content):
    if len(filtered) > 0:
        messages[i] = {**message, "content": filtered}
    else:
        # All blocks were empty text — replace with a continuation
        # message rather than leaving empty blocks that trigger 400
        # errors. Matches behavior of process_empty_text_blocks()
        # in factory.py.
        messages[i] = {
            **message,
            "content": [{"type": "text", "text": DEFAULT_ASSISTANT_CONTINUE_MESSAGE.get("content", "Please continue.")}],
Contributor


In-place mutation of the input messages list

The function docstring comment says "Avoid mutating the caller's dicts — create a shallow copy", but the function actually mutates the input messages list itself via messages[i] = .... Only the individual message dicts get shallow copies — the outer list is still modified in-place.

While each updated message dict is a new object ({**message, ...}), lines like:

messages[i] = {**message, "content": filtered}

…write back into the original list. Any caller that holds a reference to the same list object (e.g. for retry logic, logging, or span attributes) will observe the sanitised content unexpectedly.

A safe alternative is to build a new list rather than modifying in place:

def _sanitize_anthropic_messages(messages: List[Dict]) -> List[Dict]:
    result = []
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            result.append(message)
            continue
        filtered = [
            block
            for block in content
            if not (
                isinstance(block, dict)
                and block.get("type") == "text"
                and not (block.get("text") or "").strip()
            )
        ]
        if len(filtered) < len(content):
            if len(filtered) > 0:
                result.append({**message, "content": filtered})
            else:
                result.append({
                    **message,
                    "content": [{"type": "text", "text": DEFAULT_ASSISTANT_CONTINUE_MESSAGE.get("content", "Please continue.")}],
                })
        else:
            result.append(message)
    return result

Comment on lines 800 to +804
_request_body = _sanitize_request_body_for_spend_logs_payload(_request_body)
_request_body_json_str = json.dumps(_request_body, default=str)
if LITELLM_TRUNCATED_PAYLOAD_FIELD in _request_body_json_str:
    verbose_proxy_logger.info(
        "Spend Log: request body was truncated before storing in DB. %s",
Contributor


Substring check can produce false-positive truncation log entries

LITELLM_TRUNCATED_PAYLOAD_FIELD equals the string "litellm_truncated". The check:

if LITELLM_TRUNCATED_PAYLOAD_FIELD in _request_body_json_str:

…is a plain substring search across the entire serialised JSON. If any key or value in the request body legitimately contains the text "litellm_truncated" (e.g. a prompt that discusses LiteLLM internals, or a model name / metadata field), the info log will fire even though no actual truncation occurred. The same pattern appears in _get_response_for_spend_logs_payload (line ~887).

A more robust approach is to have _sanitize_request_body_for_spend_logs_payload return a flag or a lightweight wrapper indicating whether truncation happened, and base the log decision on that flag rather than a substring scan of the full JSON string.
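A hedged sketch of that flag-based alternative (names illustrative; the real sanitizer recurses into nested structures, which this sketch omits):

```python
from typing import Any, Dict, Tuple

LITELLM_TRUNCATED_PAYLOAD_FIELD = "litellm_truncated"


def sanitize_request_body(
    body: Dict[str, Any], limit: int
) -> Tuple[Dict[str, Any], bool]:
    """Truncate oversized string fields and report whether anything was cut."""
    truncated = False
    out: Dict[str, Any] = {}
    for key, value in body.items():
        if isinstance(value, str) and len(value) > limit:
            out[key] = value[:limit] + f" [{LITELLM_TRUNCATED_PAYLOAD_FIELD}]"
            truncated = True
        else:
            out[key] = value
    return out, truncated
```

The caller then logs only when the flag is True, so a prompt that happens to contain the literal text "litellm_truncated" no longer triggers a spurious info log.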

from litellm.types.router import GenericLiteLLMParams
from litellm.types.utils import LlmProviders
from litellm.utils import get_model_info

Contributor


Missing blank lines before class definition

PEP 8 requires two blank lines before a top-level class definition. There are currently no blank lines between the last import and class OVHCloudResponsesAPIConfig. The same issue exists in convert_dict_to_response.py where _normalize_images is separated from the preceding import by only one blank line.

Suggested change
from litellm.utils import get_model_info


class OVHCloudResponsesAPIConfig(OpenAIResponsesAPIConfig):

…ent provider prefix (#22906)

* fix(cost_calc): update custom_llm_provider when base_model has different provider

When base_model carries a provider prefix that differs from the
deployment provider (e.g. base_model='gemini/gemini-2.0-flash' on an
anthropic/ deployment), the custom_llm_provider was not updated,
causing cost_per_token to build an invalid model key and return 0.

After _select_model_name_for_cost_calc resolves the model name from
base_model, extract the provider prefix and update custom_llm_provider
so the downstream cost lookup uses the correct provider.

Fixes #22257

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
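The prefix-extraction step this commit describes can be sketched as follows (function names illustrative; the real logic lives in litellm's cost_calculator):

```python
from typing import Optional, Tuple


def extract_provider_prefix(model: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model' into its prefix and bare model name."""
    if "/" in model:
        provider, _, bare_model = model.partition("/")
        return provider, bare_model
    return None, model


def resolve_cost_provider(base_model: str, custom_llm_provider: str) -> str:
    # If base_model carries a different provider prefix (e.g.
    # 'gemini/gemini-2.0-flash' on an anthropic/ deployment), use it for
    # the downstream cost lookup instead of the deployment provider.
    provider, _ = extract_provider_prefix(base_model)
    if provider and provider != custom_llm_provider:
        return provider
    return custom_llm_provider
```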

* fix(cost_calc): gate base_model provider override on custom_pricing and diff check

- Skip custom_llm_provider override when custom_pricing=True (base_model unused)
- Only override when extracted provider differs from current custom_llm_provider
- Add direct completion_cost unit test for cross-provider base_model
- Add same-provider no-regression test (e.g. openai/gpt-4o on openai deployment)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: guard hidden_params override when provider already overridden by base_model

- Add _provider_overridden flag to prevent hidden_params from undoing base_model fix
- Add direct unit test verifying hidden_params doesn't override extracted provider

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: gate reasoning_auto_summary injection on supports_reasoning()

When reasoning_auto_summary is globally enabled, the reasoning param
was injected unconditionally for all models including non-reasoning
ones (e.g. gpt-4o-mini), causing OpenAI API errors. Now gated on
supports_reasoning(model, custom_llm_provider) check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: revert unrelated transformation.py changes causing TypeError

The transformation.py changes pass model= and custom_llm_provider=
kwargs to _map_optional_params_to_responses_api_request() which only
accepts (self, optional_params, responses_api_request) — causing a
TypeError at runtime for every Responses API request.

Reverted to upstream version; cost_calculator.py fix is unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: giulio-leone <6887247+giulio-leone@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
  setCopied(true);
  setTimeout(() => setCopied(false), 2000);
}).catch(() => {
  message.error("Failed to copy to clipboard");
Contributor


message is not imported from antd — this will cause a ReferenceError at runtime when navigator.clipboard.writeText fails (e.g., in non-HTTPS contexts or with restricted clipboard permissions).

Add message to the antd import:

Suggested change
message.error("Failed to copy to clipboard");
import { Collapse, Tooltip, message } from "antd";

Alternatively, handle the error with console.error() instead.

Comment on lines +93 to +104
if len(filtered) > 0:
    messages[i] = {**message, "content": filtered}
else:
    # All blocks were empty text — replace with a continuation
    # message rather than leaving empty blocks that trigger 400
    # errors. Matches behavior of process_empty_text_blocks()
    # in factory.py.
    messages[i] = {
        **message,
        "content": [{"type": "text", "text": DEFAULT_ASSISTANT_CONTINUE_MESSAGE.get("content", "Please continue.")}],
    }

Contributor


The function mutates the input messages list in-place via messages[i] = ..., despite the comment at line 91 suggesting immutability ("Avoid mutating the caller's dicts"). While each dict is shallow-copied, the list structure itself is modified, which violates the documented contract.

To ensure immutability and prevent bugs in callers that hold references to the original list, build a new list instead:

def _sanitize_anthropic_messages(messages: List[Dict]) -> List[Dict]:
    """..."""
    result = []
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            result.append(message)
            continue

        filtered = [
            block
            for block in content
            if not (
                isinstance(block, dict)
                and block.get("type") == "text"
                and not (block.get("text") or "").strip()
            )
        ]

        if len(filtered) < len(content):
            if len(filtered) > 0:
                result.append({**message, "content": filtered})
            else:
                result.append({
                    **message,
                    "content": [{"type": "text", "text": DEFAULT_ASSISTANT_CONTINUE_MESSAGE.get("content", "Please continue.")}],
                })
        else:
            result.append(message)
    
    return result

The call site already reassigns (messages = _sanitize_anthropic_messages(messages)), so this change is safe.

Comment on lines +49 to +60
def _normalize_images(
    images: Optional[List[Dict[str, object]]],
) -> Optional[List[Dict[str, object]]]:
    """Normalize image items to include required 'index' field if missing."""
    if images is None:
        return None
    normalized: List[Dict[str, object]] = []
    for i, img in enumerate(images):
        if isinstance(img, dict) and "index" not in img:
            img = {**img, "index": i}
        normalized.append(img)
    return normalized
Contributor


_normalize_images is dead code — never called

This function was added by the PR but is never invoked anywhere in the codebase. A grep of the whole repo confirms the only occurrence is its own definition at line 49.

The actual image-index fix is handled by the pre-existing _normalize_images_for_message() (lines 70–85), which was already present before this PR and is already called at line 627:

images=_normalize_images_for_message(
    choice["message"].get("images", None)
),

The two functions are nearly identical in behavior. Because _normalize_images is never reached by any call path, the stated bug fix is entirely carried by the pre-existing helper — not by this new function. Either remove this function or wire it up to an actual call site.

Comment on lines 377 to +382
from litellm.proxy.management_endpoints.jwt_key_mapping_endpoints import (
    router as jwt_key_mapping_router,
)
from litellm.proxy.management_endpoints.jwt_key_mapping_endpoints import (
    router as jwt_key_mapping_router,
)
Contributor


Duplicate import of jwt_key_mapping_router

Lines 377–379 and 380–382 import the same name from the same module twice. This is harmless at runtime (Python deduplicates module imports) but is a lint error and adds noise.

Remove the second import block (lines 380–382); the first import at lines 377–379 is sufficient.

Comment on lines +40 to +59
supports_function_calling: Optional[bool] = None
try:
    model_info = get_model_info(model, custom_llm_provider="ovhcloud")
    supports_function_calling = model_info.get(
        "supports_function_calling", False
    )
except Exception as e:
    verbose_logger.debug(f"Error getting supported OpenAI params: {e}")
    pass

if supports_function_calling is not True:
    verbose_logger.debug(
        "You can see our models supporting function_calling in our catalog: https://www.ovhcloud.com/en/public-cloud/ai-endpoints/catalog/ "
    )
    # Remove tool-related params for models that don't support function calling
    for param in ("tools", "tool_choice"):
        if param in supported_params:
            supported_params.remove(param)

return supported_params
Contributor


Debug log fires for both unknown and confirmed non-function-calling models

supports_function_calling is initialised to None. When get_model_info succeeds and returns False (model is confirmed not to support function calling), the condition at line 50 (if supports_function_calling is not True) evaluates to True, and the debug message is emitted.

This means the message is logged in two scenarios:

  1. When get_model_info raised an exception (supports_function_calling is still None) — warrants the "check the catalog" hint
  2. When the model is confirmed not to support function calling (value is False) — expected behavior with no actionable guidance needed

Only case 1 should emit the hint. Consider separating the logic:

if supports_function_calling is None:
    verbose_logger.debug(
        "Could not determine function calling support; "
        "see our catalog: https://www.ovhcloud.com/en/public-cloud/ai-endpoints/catalog/"
    )

if supports_function_calling is not True:
    for param in ("tools", "tool_choice"):
        if param in supported_params:
            supported_params.remove(param)
